CORDIS - EU research results

Negotiating World Research Data: A science diplomacy study

Periodic Reporting for period 2 - NEWORLDatA (Negotiating World Research Data: A science diplomacy study)

Reporting period: 2023-07-01 to 2024-12-31

We have entered an age of data overload fuelling planetary concerns such as heightened surveillance, invasive predictive technologies, and even scientific uncertainty in tackling global warming. Yet while the current data deluge might be new, international organizations administering data in general, and scientific data more specifically, have been around for some time. Their present infrastructure reveals that a main feature of their operations is not the data deluge but the shaping of global imbalances in data access and distribution. This is because the global data infrastructure that underpins the world scientific enterprise is not evenly distributed geographically; it is rather 'lumpy', i.e. denser in a few world regions (mainly in the Northern Hemisphere) where most datasets are hosted and where most groups using these datasets in data-intensive research are located. This co-location makes the global scientific network even more uneven, as a select group of users has the opportunity and capacity to access the data, and does so more intensively than users in other regions of the world. Hence groups and organizations sitting at the centre of the global scientific network strengthen the central positions they occupy, and those at the periphery tend to be further marginalized. This is a problem with a past, in that the current asymmetries in the global distribution of datasets are connected to the evolution of world research data organizations and the selective criteria they have adopted, especially with regard to membership and the hosting of world data repositories.

Neworld@a aims to better understand these global imbalances as they are now, and also to chart their historical legacy. To this end, we are undertaking a study combining scientometrics and historical approaches. The scientometrics study aims to provide quantitative and visual evidence of the current unevenness of the scientific data infrastructure through a combination of world and multi-layered network maps (see attached image) displaying which regions of the world are more or less densely connected to the available scientific datasets, and can therefore more or less easily use them in data-intensive research. This study is combined with a parallel analysis of international treaties facilitating data exchange, in order to establish whether existing international legal provisions for data sharing play a part in shaping these asymmetries, especially by accelerating exchanges between specific groups and regions in the Northern Hemisphere while slowing them down in others. Both the scientometrics and legal studies then feed into the historical analysis of the origins and evolution of international data organizations, to better understand whether their historical trajectory has tended to exacerbate or remove these asymmetries in global data distribution. The final component of the project aims to use the scientometrics and historical evidence amassed to elaborate on what science policy provisions could counter the widening gap in data distribution that exists at the global level.

The combined historical, legal treaties and scientometrics analyses should thus return a comprehensive picture of a problem of vital importance to global society and governance, namely the inequality in the distribution of scientific data resources across the planet. This is a problem with decisive repercussions for international scientific collaborations, and especially for what we now call 'science diplomacy', i.e. the use of these collaborations to build constructive relations between nations and address global societal challenges. It is our working hypothesis that an uneven distribution of research data resources can advance science in some world regions while hindering scientific development in others. In turn, a lack of capacity building in individual research communities can also result in subordinating them to others with overwhelming data capacity. The repercussions within the scientific community and society at large are compelling: unless these asymmetries are better understood and fully addressed, it is impossible to administer the global scientific community in truly transformative ways so as to address global challenges. It is worth recalling the findings of the 2023 mid-term report on the implementation of the UN Sustainable Development Goals, which emphasizes not only that we are far off track from achieving the 2030 SDG targets, but also that achieving them requires science and technology capacity building in ALL countries (https://sdgs.un.org/sites/default/files/2023-09/GSDR%202023%20Key%20Messages_1.pdf). As far as data are concerned, this means a comprehensive re-thinking of how data resources have been distributed over the last century and of whether different distribution criteria should be developed.
The last two years of collaborative work in the framework of Neworld@a have been marked by an effort to research, examine and display these global data asymmetries. The effort has been deliberately incremental, in that we always planned for the first two sub-projects (on scientometrics and on legal treaties) to feed into the third (on historical reconstruction) by highlighting aspects of global data distribution that are particularly important to research further. We therefore agreed to stagger their starting dates so as to allow for this feeding through. The scientometrics and legal treaties studies started first, in 2022, followed by the historical analysis, which evolved in its various strands in 2023 and 2024. There will be a fourth sub-project, on policy provisions, starting in 2025 so that it can draw as much as possible on the previous three strands when considering potential provisions for a more even distribution of data across the planet.

The scientometrics study has enabled us to focus, from a quantitative viewpoint, on the current features of these global data imbalances, especially in terms of the current geographical distribution of datasets, access provisions, and elements of unevenness in their use in the context of the global scientific enterprise. Using Re3Data, an internet-based repository of information on research datasets, we have consistently mapped these asymmetries, also using social network analysis methods. The world mapping exercise has returned vital results, showing especially how countries in Africa, Latin America and some Asian regions are at the margins of networks displaying access to research data, in that they access repositories less frequently and host fewer of them. Our ongoing exploration suggests that data asymmetries play a part in shaping the gap in scientific development between what we now call Global North and Global South countries. We have further extended these working hypotheses to the study of treaties enabling exchange of scientific information and data sharing. While this study could not benefit in the same way from a single repository (and therefore drew on a range of sources), it has mirrored the scientometrics study in showing that Global North countries enjoy a level of research data sharing among themselves, and of bilateral exchanges with selected Global South countries, far higher than that found among Global South countries.
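
As a purely illustrative sketch of the kind of measure such a mapping exercise might rely on, the Python snippet below counts how many repositories each country hosts, computes each country's degree in a bipartite country-repository access network, and summarizes unevenness with a Gini coefficient. It is not the project's actual method. The records, country names and function names are all hypothetical placeholders, and nothing here is drawn from Re3Data or from the project's results.

```python
from collections import Counter

# Hypothetical toy records: (repository id, hosting country, countries of users).
# Illustrative placeholders only; not Re3Data records and not project results.
TOY_REPOSITORIES = [
    ("repo_a", "Country1", ["Country1", "Country2"]),
    ("repo_b", "Country1", ["Country1", "Country2", "Country3"]),
    ("repo_c", "Country2", ["Country1", "Country2"]),
    ("repo_d", "Country1", ["Country1"]),
]
ALL_COUNTRIES = ["Country1", "Country2", "Country3", "Country4"]


def hosting_counts(repositories, countries):
    """Count hosted repositories per country, including countries hosting none."""
    counts = Counter(host for _, host, _ in repositories)
    return {c: counts.get(c, 0) for c in countries}


def access_degree(repositories, countries):
    """Degree of each country in the bipartite country-repository access network."""
    degree = Counter()
    for _, _, users in repositories:
        degree.update(set(users))  # one edge per (country, repository) pair
    return {c: degree.get(c, 0) for c in countries}


def gini(values):
    """Gini coefficient: 0 means perfectly even; larger values mean more concentrated."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


if __name__ == "__main__":
    hosted = hosting_counts(TOY_REPOSITORIES, ALL_COUNTRIES)
    accessed = access_degree(TOY_REPOSITORIES, ALL_COUNTRIES)
    print("hosted:", hosted)
    print("access degree:", accessed)
    print("Gini (hosting):", gini(hosted.values()))
```

In this toy setting a country that hosts nothing and accesses nothing appears with a count and degree of zero, which is how a peripheral position would show up in such a network. A real analysis would need actual repository metadata and a justified choice of inequality and centrality measures.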

The scientometrics and legal treaties analyses have provided the evidence needed to focus our attention on the historical determinants of these asymmetries in global data distribution. We have thus explored the archives documenting the evolution of international data organizations, with the ambition of finding out whether their histories display transitions that can be seen as setting global data distribution on a path of unevenness. We have focussed in particular on two data organizations operating in the context of the chief international scientific coordination body of the 20th century, i.e. the International Council of Scientific Unions (now the International Science Council). These two organizations are the World Data System (WDS) and the Committee on Data for Science and Technology (CODATA). Although still ongoing, the study has already revealed that current data asymmetries result from the development of these organizations as collaborative enterprises whose features paved the way for such asymmetries. From 1957 the WDS datasets were hosted only in a select number of scientifically developed countries, while from 1966 CODATA welcomed only a limited number of national representatives, mostly from the same countries. In turn, we are now seeking to demonstrate that the dynamics of international collaboration in data sharing can be seen as a key factor in shaping these asymmetries, with a few countries with pre-existing scientific infrastructures dominating international collaborative efforts in data sharing.
In the remaining two and a half years of the project we aim to go further, and beyond the state of the art, by completing the studies we have undertaken. We believe that displaying and reflecting on the data asymmetries that have shaped and still shape the global scientific enterprise challenges the conventional wisdom in several ways. Firstly, it reappraises a conventional (but naïve) understanding of the global scientific enterprise as typified by modes of collaboration that put its contributors on an equal footing in establishing scientific relations. It suggests instead, through the example of global data distribution, that at least over the last fifty years critical collaborative research projects regarding data have been typified by a few agenda-setters coordinating the development of the global scientific infrastructure and placing other groups in a subordinate position. Secondly, it challenges the simplistic understanding of the diplomatic qualities of these collaborative scientific enterprises as establishing constructive relations that empower everyone in these global scientific networks. It shows instead how selective control over scientific infrastructures empowers only those who can set the agenda.

In order to do so, we will extend the historical examination to other datasets made available in the context of the United Nations because they are critical to world affairs, such as those in the marine sciences (the IODE data system of the Intergovernmental Oceanographic Commission) and nuclear energy (the INIS system of the International Atomic Energy Agency). We wish to verify whether similar asymmetries exist in the sharing of, and access to, data on human health, human population and animal species. In particular, we expect to contribute new research on different datasets such as those produced by the World Health Organization and the International Union for Conservation of Nature. In parallel, we are also examining how the Cold War affected these exchange dynamics. Our research appears to show that the Soviet Union was fundamentally complicit in the administration of asymmetrical data networks, being fully integrated in both WDS and CODATA while developing and operating a separate data network, Gosstandardt, for socialist countries in a regime of cooperative antagonism. Much also remains to be ascertained about the role of the People's Republic of China, which was first isolated and then re-integrated in the 1980s, when it started playing a pivotal role in the WDS and also became a powerful presence in international organizations devoted to research data.

By 2025 both sub-projects on scientometrics and legal treaties will reach an end, while the historical study of data organizations will continue for a further year. Building on the analysis carried out so far, we expect the historical study to provide a comprehensive picture of the evolution of various interlocked systems of data exchange and to explain even more accurately how this evolution has produced asymmetries. In turn, our final task before ending the project will be to verify what policy provisions align with the scientometrics and historical evidence unearthed. Here we expect the study to reveal the need for a 'levelling up' agenda in the global scientific enterprise that goes beyond the simple issue of redistributing funds (for the UK case see for instance: https://assets.publishing.service.gov.uk/media/601c0675d3bf7f70bc2e1f1d/CST_levelling_up_letter.pdf). We also envisage a re-thinking of the role of scientific infrastructures (including data). We are also likely to suggest revising current data provisions that place overwhelming emphasis on access, in recognition that while access to data is indeed a vital issue, it is not the only decisive one. In particular, unless data production and management are more evenly distributed geographically, datasets will always respond to the data-intensive research priorities of the groups that lead this production, no matter how accessible the data are. It is our preliminary understanding that greater redistribution would in turn set different priorities for the definition of new datasets, more closely aligned with the priorities of countries undergoing processes of scientific development.
Global links between data centers