
Enabling the European Business Graph for Innovative Data Products and Services

Periodic Reporting for period 2 - euBusinessGraph (Enabling the European Business Graph for Innovative Data Products and Services)

Reporting period: 2018-01-01 to 2019-06-30

Corporate data is an ever-increasing asset in the digitalisation of business and society, and its use is extremely significant in many business sectors (e.g. business information, marketing and sales, business publishing) and societal activities (e.g. transparency and accountability). The integration of company-related data from authoritative and non-authoritative public and private sector sources is a difficult and expensive task that hinders cross-sectorial innovation. Addressing this problem in a coherent and unifying way represents a business opportunity for a wide range of companies in the data economy. euBusinessGraph aims to create the foundations of a European cross-border and cross-lingual business graph by developing and integrating technologies for aggregating, linking, provisioning and analysing company-related data, thereby demonstrating innovation across sectors where company-related data value chains are relevant.

euBusinessGraph was a timely project, given the high potential of corporate data. The project was completed successfully and achieved all of its objectives and milestones. It delivered innovative software and services to the market and strengthened the competitiveness and growth of the companies participating in the consortium. It also delivered six business cases that have already demonstrated substantial impact and have high potential for growth.
The euBusinessGraph project provided the necessary foundation for a knowledge graph of Europe-wide company-related information. We developed a Company Data Model (i.e. vocabulary) that covers companies, company types, status, jurisdictions, addresses, location data, classifications of economic activities, company registrations (in official and alternate registers), relevant social data (e.g. websites), and company officers and the nature of their relationships with their companies. In developing the company model we reused other appropriate ontologies (e.g. W3C Org, RegOrg and schema.org). Furthermore, we developed a model to create euBusinessGraph company identifiers. To support the data onboarding, we designed and implemented a set of data ingestion services integrated in the euBusinessGraph platform. The data ingestion services support data import, data cleaning, data mapping and transformation, data enrichment, schema-level semantic annotation, knowledge graph vocabulary faceted search and statistics, data hosting and queries, analytics, multi-lingual annotation, as well as other operational services (e.g. marketplace portal). In addition, we defined RDF Shapes to validate the onboarded data. As part of the work on the business graph we integrated and deployed selected datasets from the data providers in accordance with the euBusinessGraph Company Data Model.
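As a rough illustration of the kind of record the Company Data Model describes, and of what shape-based validation checks, the following minimal Python sketch models a company with the aspects listed above and applies a few constraints. It is not the project's vocabulary or its RDF Shapes: every class name, field name and constraint here is a hypothetical assumption chosen for illustration, and the example company is fictitious.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Registration:
    register: str          # hypothetical register name
    identifier: str        # identifier issued by that register
    official: bool = True  # official vs. alternate register


@dataclass
class Officer:
    name: str
    role: str  # nature of the relationship with the company, e.g. "director"


@dataclass
class Company:
    legal_name: str
    jurisdiction: str  # assumed here to be a two-letter country code
    company_type: Optional[str] = None
    status: Optional[str] = None
    address: Optional[str] = None
    activity_codes: List[str] = field(default_factory=list)
    registrations: List[Registration] = field(default_factory=list)
    websites: List[str] = field(default_factory=list)
    officers: List[Officer] = field(default_factory=list)


def validate(company: Company) -> List[str]:
    """Return constraint violations; an empty list means the record conforms.

    The three constraints are illustrative assumptions, not the project's shapes.
    """
    problems = []
    if not company.legal_name.strip():
        problems.append("legal_name must not be empty")
    if len(company.jurisdiction) != 2 or not company.jurisdiction.isalpha():
        problems.append("jurisdiction must be a two-letter code")
    if not any(r.official for r in company.registrations):
        problems.append("at least one official registration is required")
    return problems


# Fictitious record used only to exercise the sketch.
example = Company(
    legal_name="Example AS",
    jurisdiction="NO",
    registrations=[Registration("example-register", "000000000")],
    officers=[Officer("Jane Doe", "director")],
)
print(validate(example))
```

In an RDF setting the same idea would be expressed declaratively, with shapes stating which properties a company resource must carry, rather than as imperative checks.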
Apart from the technical results, the project finalized the development of six data-driven business products and services, based on company-related data value chains across domains, that can be replicated throughout Europe. These are: 1) the Corporate Events Data access, which integrates public corporate register data with the OpenCorporates.com database; 2) the Tender Discovery Service, which supports companies in discovering new open tender opportunities tailored to their company profiles; 3) the Atoka+ B2B lead generation service; 4) the Customer Relationship Management Service (CRM-S), which leverages business data to establish new lines of business; 5) the Data Journalism Product Service, which supports journalists in dealing with complex and large volumes of company-related data across three journalistic workflows: search, monitoring and content production; and 6) the Norwegian Public Registries API service, which improves the accessibility of several major (currently disconnected) Norwegian authoritative public sector registers.
euBusinessGraph advanced the state of the art both in the area of technical infrastructures for the business graph and in the specific domains of the six data-driven products and services.
The technical infrastructure realized in the euBusinessGraph platform is composed of: 1) data ingestion services (i.e. DataGraft, Grafterizer 2.0, ASIA and ABSTAT) to onboard data from different providers as a business graph according to the euBusinessGraph Company Data Model (vocabulary); 2) cross-cutting business analytics services (i.e. Wikifier, EventRegistry, Graph based analytics API, Relation tracker and TWEC) on top of the business graph; and 3) marketplace and data hosting services (i.e. the marketplace portal and GraphDB Cloud) to host and provide access to the business graph. The marketplace portal includes functionalities such as a graphical user interface, faceted search, analytics, and company profiles.
euBusinessGraph defined an innovative vocabulary (i.e. the Company Data Model) for representing company-related data, covering key aspects of basic information about companies (e.g. company names, types, status, jurisdictions, addresses, classifications of economic activities, company registrations, company identifiers and company officers). Data from a selection of jurisdictions were onboarded and published according to the developed vocabulary to demonstrate its usefulness as a mechanism to harmonize and integrate company-related data into a business graph across jurisdictions and across data providers.
The six data-driven products and services advanced the state of the art in their respective domains by developing innovative solutions for dealing with company-related data, with the potential to impact both the public and private sectors. The Corporate Events Data access (CED) service was refined through several iterations leading up to an official launch in June 2019, giving its users access to information about corporate events. The Tender Discovery Service was finalized, enabling its users to discover new active open tender opportunities based on rich company profiles. The Atoka+ service was finalized, extending the coverage of the Atoka Lead Generation Platform beyond the Italian market and giving users access to rich information about companies in several jurisdictions that were not available at the beginning of the project. The CRM-S solution was enriched with an analytics platform infrastructure for continuously running and updating machine learning models, and its credit risk model for Norway was improved. The Data Journalism Product (the Screener Tool) was finalized; it uses the euBusinessGraph marketplace and data hosting APIs to give data journalists easier access to company data. Finally, the BR-S service (the Norwegian Public Registries API service) was released in production, and the information available in the service has been increased incrementally.
euBusinessGraph defined exploitation strategies for each business product/service as well as for each technical component developed as part of the project. Furthermore, during the second reporting period the project actively disseminated its work at various events, produced scientific publications, and contributed to the supervision of students.