Periodic Reporting for period 1 - DATAMITE (DATA Monetization, Interoperability, Trading & Exchange)
Reporting period: 2023-01-01 to 2024-06-30
DATAMITE empowers European companies by delivering a modular, open-source, multi-domain framework to improve DATA Monetizing, Interoperability, Trading and Exchange, provided as software modules, training, and business materials.
The DATAMITE project develops a simple but impactful technical framework that enables European enterprises and public administrations to overcome existing challenges and monetise their data more easily. The core objectives are to help users better monetise and govern their data and enhance trust in it, by developing a set of key modules: Data Governance, Quality, Security, Sharing & Supporting Tools. The modules are built on top of existing open-source components, which provides interoperability with current leading storage technologies.
DATAMITE will validate the results in 3 different use cases with a total of 6 pilots, demonstrating that the framework is interoperable and usable across different domains and user needs, such as 1) Intra-corporate, multi-domain data exchange; 2) Data trading among Data Spaces; 3) Integration with other initiatives such as Data Markets, the EU AI-on-demand platform, or DIHs. The sectors covered by the pilots are agriculture, energy, industrial and manufacturing, and climate.
To achieve this, the project relies on a consortium of 27 partners from 13 countries, bringing together key actors of the Data Value Chain: Data Spaces technical and business stakeholders, multiple key communities, key experts in Legal and SSH aspects to guarantee legal and societal compliance, and facilitators on open-source community building and standardisation activities to accelerate the transfer to the market.
The main achievement of this period has been the design of DATAMITE’s architecture. The project is ambitious in the functionality and services it offers to users, and bringing everything together into a single framework while keeping it as modular as possible was a complex task. Although every technical task has its own complexity, the effort put into the metadata model deserves particular mention: it gives users room to enrich data in an inherent way (as we consider it should be), providing the business view through the use of vocabularies. Along the same lines, DQV has been extended to describe Data Quality information derived from user-defined rules, an extension we will consider for standardisation. Also notable is the work in the Data Sovereignty component to create tools that ease the creation of policies while keeping their enforcement in mind, especially through the EDC connector that will be integrated. Regarding data sharing, the decision not to focus only on EDC and dataspaces or Gaia-X has proved valid: publishing data to different portals (e.g. in several pilots) has gained importance as new possibilities have arisen, so the project is not constrained to initiatives that may still be at an incipient stage.
Although not all components can yet be integrated into a common flow, the number of components already interacting successfully is promising and opens the door to more elaborate results in the second half of the project.
The main result that can be presented is the architecture, which illustrates how DATAMITE improves on current alternative open-source approaches. The metadata model is also a relevant contribution: current metadata vocabularies are mainly designed for publishing data into public, market-like catalogues, and less so for intra-company cataloguing and exploitation. DATAMITE’s metadata model leverages DCAT and extends it to offer finer-grained detail, especially for multi-artifact datasets with different levels of complexity.
Additionally, regarding data quality, DATAMITE is working on extending DQV, the vocabulary for quality metadata, to include user-defined metrics and rules. This extension, named DDQV (DATAMITE DQV), is at an advanced stage and will be proposed to the community. Similarly, we are working on proposing a set of data quality categories and dimensions as part of this extension, given the lack of a standard approach on this matter.
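To make the idea of quality metadata derived from user-defined rules more concrete, the following is a minimal Python sketch, not DATAMITE's implementation. It evaluates a rule over a few rows and records the result as a DQV quality measurement in a JSON-LD-style structure. The `dqv:` terms come from the published DQV vocabulary; the `ex:` namespace, the `ex:userDefinedRule` property, the example IRIs, the sample rows and the helper names `pass_ratio` and `quality_measurement` are assumptions made purely for illustration and do not reflect the actual DDQV terms.

```python
import json

DQV = "http://www.w3.org/ns/dqv#"
DCAT = "http://www.w3.org/ns/dcat#"
# Hypothetical namespace standing in for an extension; NOT the real DDQV.
EX = "https://example.org/ddqv-sketch#"


def pass_ratio(rows, rule):
    """Return the fraction of rows for which the user-defined rule holds."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if rule(row)) / len(rows)


def quality_measurement(dataset_iri, metric_iri, dimension_iri, rule_text, value):
    """Describe a metric and one measurement of it using DQV terms."""
    metric = {
        "@id": metric_iri,
        "@type": "dqv:Metric",
        "dqv:inDimension": {"@id": dimension_iri},
        # Illustrative property only: attaches the rule text to the metric.
        "ex:userDefinedRule": rule_text,
    }
    measurement = {
        "@type": "dqv:QualityMeasurement",
        "dqv:computedOn": {"@id": dataset_iri},
        "dqv:isMeasurementOf": {"@id": metric_iri},
        "dqv:value": value,
    }
    return {
        "@context": {"dqv": DQV, "dcat": DCAT, "ex": EX},
        "@graph": [metric, measurement],
    }


# Made-up sample rows; the rule checks that a yield value is present.
rows = [
    {"field_id": 1, "yield_t_ha": 5.2},
    {"field_id": 2, "yield_t_ha": None},
    {"field_id": 3, "yield_t_ha": 4.8},
    {"field_id": 4, "yield_t_ha": 6.1},
]
value = pass_ratio(rows, lambda row: row["yield_t_ha"] is not None)

doc = quality_measurement(
    dataset_iri="https://example.org/dataset/harvest-2023",
    metric_iri="https://example.org/metric/yield-present",
    dimension_iri="https://example.org/dimension/completeness",
    rule_text="yield_t_ha is not null",
    value=value,
)
print(json.dumps(doc, indent=2))
```

Keeping the rule on the metric and the computed value on a separate measurement mirrors DQV's own split between `dqv:Metric` and `dqv:QualityMeasurement`, so the same rule can be re-evaluated on other datasets without redefining it.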
Finally, the project's overall success will depend in part on the maturity of tools such as the EDC connectors mentioned above, and on the level of usability achieved with DATAMITE’s catalogue. Likewise, properly creating data products remains challenging when they go beyond static datasets, i.e. data services, for which quality estimations must be provided.